We explore a replicator-mutator model of the repeated Prisoner’s Dilemma involving three
strategies: always cooperate (ALLC), always defect (ALLD), and tit-for-tat (TFT). The dynamics
resulting from single unidirectional mutations are considered, with detailed results presented
for the mutations TFT $\to$ ALLC and ALLD $\to$ ALLC. For certain combinations of parameters,
given by the mutation rate $\mu$ and the complexity cost c of playing tit-for-tat, we find that the
population settles into limit cycle oscillations, with the relative abundance of ALLC, ALLD,
and TFT cycling periodically. Surprisingly, these oscillations can occur for unidirectional mutations
between any two strategies. In each case, the limit cycles are created and destroyed by
supercritical Hopf and homoclinic bifurcations, organized by a Bogdanov-Takens bifurcation.
Our results suggest that stable oscillations are a robust aspect of a world of ALLC, ALLD,
and costly TFT; the existence of cycles does not depend on the details of assumptions of how
mutation is implemented.

Keywords: replicator-mutator model, evolutionary game theory, bifurcation analysis, limit cycle 


1. Introduction 


Cooperation, where individuals pay costs to benefit others, is a cornerstone of human civilization. By 
cooperating, people create value and thus increase the “size of the pie.” This makes a group in which 
everyone cooperates better off than a group where everyone is selfish. Cooperation can be difficult to 
achieve, however, because creating that collective benefit is often individually costly. How, then, could the 
selfish process of natural selection give rise to such altruistic cooperation? Evolutionary game theorists
have devoted a great deal of effort to answering this question [Axelrod, 1984; Boyd et al., 2003; Boyd & [...],
2002; Helbing & Yu, 2009; Manapat et al., 2012; May, 1987; McNamara et al., 2008; Nakamaru et al., 1997;
Nowak, 2006; Nowak et al., 2004; Panchanathan & Boyd, 2004; Rand & Nowak, 2012, 2013; Rand et al.,
2013; Szolnoki et al., 2009; Szolnoki & Szabo, 2004; Tarnita et al., 2009; Traulsen & Nowak, 2006; Trivers,
1971; Wedekind & Milinski [...], 2010].

The standard game-theoretic paradigm for studying cooperation is the Prisoner’s Dilemma [Rapoport,
1965]. Two players simultaneously choose to either cooperate (C) or defect (D), and each receives a payoff
depending on the two choices. Two cooperators both earn the reward of mutual aid R, while two defectors
receive a punishment P. If one player cooperates while the other defects, the defector earns the temptation
payoff T while the cooperator receives the sucker payoff S. If the relationship T > R > P > S holds,
the game is a Prisoner’s Dilemma because mutual cooperation is better than mutual defection (R > P),
but individually defectors always out-earn cooperators (T > R if the partner cooperates, and P > S if
the partner defects). Thus the Prisoner’s Dilemma captures the tension between individual and collective
interests, the conundrum at the heart of cooperation.

To study the evolution of cooperation, evolutionary game theorists typically combine game theory 
with differential equations to create an evolutionary dynamic. The replicator equation is one of the most 
common such models [Hofbauer et al., 1979]: strategies with above-average payoffs become more common
over time while strategies with below-average payoffs become less common. As described above, defectors 
always out-earn cooperators. Thus in a simple world of pure cooperators versus pure defectors, evolution 
via the replicator equation always leads to the extinction of cooperation and a population made up solely 
of defectors. 

How, then, do we explain the success of cooperation which is so evident in the world around us? 
Numerous mechanisms for the evolution of cooperation have been proposed [Nowak, 2006], and empirical
evidence has been provided for their importance in human cooperation [Rand & Nowak, 2013]. Chief
among these mechanisms is direct reciprocity, also known as reciprocal altruism [Axelrod, 1984; Nowak
& Sigmund, 1992, 1993; Rand et al., 2009; Trivers, 1971]: when agents interact repeatedly, evolution can
favor cooperation. If I will only cooperate with you in the next period if you cooperate with me in the
current period, cooperation can be the payoff-maximizing strategy (as long as the game continues to the 
next period with high enough probability). 

Tit-for-tat (TFT) is the most well-known of these reciprocal strategies. TFT begins by cooperating, 
and then merely copies its opponent’s move in the previous period. Thus cooperators receive cooperation 
and profit, whereas defectors receive defection and lose out. Moreover, a population in which everyone 
plays TFT is resistant to invasion by defectors. If a lone defector is introduced into such a population, that 
ALLD player receives a low payoff relative to the resident TFT players. As a result, selection disfavors the 
invader and TFT is evolutionarily stable: repeated interactions promote the evolution of cooperation. 

Tit-for-tat, however, has an Achilles’ heel [Nowak, 2006]: in some situations it can be invaded by kinder,
gentler strategies [Boyd & Lorberbaum, 1987; Imhof et al., 2005; van Veelen & García, 2010; van Veelen
et al., 2012]. To see this, imagine again a population where everyone plays TFT, except for one player
who always cooperates (ALLC). Both the resident TFT players and the ALLC deviant cooperate in every
round, and thus receive equal payoffs. This means that neutral drift (in the sense of population genetics) 
can allow ALLC to increase in frequency, a process called neutral invasion. A more serious weakness of 
the TFT players is that other factors can impose costs on them - costs which ALLC players avoid. For 
example, TFT is a more sophisticated strategy than ALLC, and hence may incur complexity costs. These 
costs arise because TFT needs to spend energy interpreting the other player’s last move before it can 
respond appropriately, whereas ALLC is an unconditional strategy. On top of that, the vindictiveness of 
TFT can hurt it in noisy or error-prone environments. If players sometimes make mistakes and defect when 
they meant to cooperate, such a mistake would go unnoticed by ALLC, but would send two TFT players 
into a vendetta of retaliatory defections. 

In the presence of such costs, the TFT residents are ultimately overtaken by the ALLC invaders. And 
there’s the rub: once ALLC becomes sufficiently common at TFT’s expense, it opens the door to invasion by 
nasty, uncooperative ALLD players, who can ruthlessly exploit ALLC and sweep through the population. 
Through this sort of scenario, TFT populations can eventually wind up succumbing to defection, suggesting 
that cooperation may be doomed even in repeated games. 

Here we present a solution to this problem: we show that incorporating mutation into the replicator 
equation breaks ALLD’s dominance over the evolutionary outcomes by giving rise to stable cycles involving 
substantial levels of cooperation. Surprisingly, it is not necessary for every strategy to mutate into every 
other strategy in order to get cyclical behavior, or even for ALLD to mutate into TFT. We find that adding 
any single unidirectional pathway for mutation can lead to stable cycles. Thus we provide evidence that
stable oscillations are a robust aspect of a world of ALLD, ALLC and costly TFT players, and that direct 
reciprocity can lead to substantial cooperation even in the face of invasion by unconditional cooperators. 


1.1. Relation to previous work 


To allow for mutation, the relevant mathematical setting needs to change from the replicator equation to 
the replicator-mutator equation [Nowak, 2006; Stadler & Schuster, 1992]. A number of previous authors
have studied limit cycles in replicator-mutator equations, motivated by applications to language change
[Mitchener & Nowak, 2004], autocatalytic chemical reaction networks [Stadler & Schuster, 1992], evolutionary
games [Bladon et al., 2010; Galla, 2011; Imhof et al., 2005], and multi-agent decision making [Pais
& Leonard, 2011; Pais et al., 2012].


Our work is most closely related to that of Imhof et al. [2005]. They studied the evolutionary game
dynamics of ALLD, ALLC, and TFT for finite populations, where stochastic effects become important. One 
of their most striking results was the observation of the phenomenon mentioned above - an evolutionary 
cycle that goes from ALLD to TFT to ALLC and back to ALLD again. Even more remarkably, they found 
that the cycle spends nearly all its time lingering in the vicinity of TFT, even though ALLD is a strict 
Nash equilibrium! 

In the model considered by Imhof et al. [2005], the conditional strategy TFT was assumed to incur a
complexity cost c, relative to the simpler unconditional strategies ALLC and ALLD. Mutation was assumed
to be uniform: every strategy mutates to the other two with equal probability. More precisely, if $i \neq j$,
strategy i mutates to strategy j with probability $\mu$ and stays the same with complementary probability
$1 - 2\mu$.

For the deterministic case of infinitely large populations, Imhof et al. [2005] found that for certain
combinations of parameters c and $\mu$, the replicator-mutator equations have a stable limit cycle, corresponding
to the evolutionary cycles observed in their simulations. They mentioned bifurcations associated with the
birth and death of the limit cycle, but did not present a bifurcation analysis or a stability diagram to locate
the bifurcation curves in the $(\mu, c)$ parameter space.

We were curious to learn more about the evolutionary cycle seen in the infinite population model. 
What bifurcations create and destroy this limit cycle? How does its bifurcation structure depend on the 
details of how the mutations are implemented, in a graph-theoretic sense? That is, if we think of the three 
strategies as the vertices of a triangle graph, with mutations occurring along the edges between them, the 
uniform mutation case studied by Imhof et al. [2005] amounts to a complete graph with equal weights $\mu$
on its six directed edges. What would happen, by contrast, in the opposite extreme case of unidirectional 
mutation along one of these six directed edges? 

We found that for all six possible unidirectional mutations, a stable limit cycle exists in a certain part 
of $(\mu, c)$ space. The cycle always oscillates in the same rotational sense, moving from the neighborhood of
ALLD to TFT to ALLC and back toward ALLD again, just as it does in the case of uniform mutation. 
Moreover, the commonalities extend to the types and locations of the bifurcations that create and destroy 
the cycle. The region in parameter space where stable limit cycles exist is always bounded on one side by a 
curve of supercritical Hopf bifurcations and on the other side by a curve of homoclinic bifurcations. These 
two curves meet tangentially at a Bogdanov-Takens point at one end of the stability region for the limit 
cycle. All of these statements are true of the uniform mutation case as well. 

This paper is organized as follows. In the next section we review the formulation of the Prisoner’s
Dilemma and its associated replicator equation, followed by their generalization to the replicator-mutator
equation. Then we focus on two special cases of unidirectional mutation, TFT $\to$ ALLC and
ALLD $\to$ ALLC, and summarize the results from the other four cases, as well as for the case of uniform
mutation. The paper concludes with a conjecture and a brief discussion. 

2. Model

2.1. Prisoner’s Dilemma 

Following Axelrod [1984], we fix the parameter values T = 5, R = 3, P = 1, S = 0. These satisfy the 
inequalities required for the game to qualify as a Prisoner’s Dilemma: T > R > P > S and R > (T + S)/2.
The final inequality implies that if the two players play many rounds with one another, it is better for 
both of them to cooperate all the time rather than engage in alternating bouts of getting suckered and 
suckering the other. 

Now consider a repeated Prisoner’s Dilemma among players using the strategies ALLC, ALLD, and
TFT. In the limit where the players meet infinitely often and there is no discounting of future
interactions, their average payoffs are given by the matrix in Table 1 (each entry shows the
average payoff that the row player gets when playing against the specified column player):

Table 1. Payoff matrix of the repeated Prisoner’s Dilemma.

        ALLD    TFT     ALLC
ALLD    P       P       T
TFT     P       R       R
ALLC    S       R       R

For example, an ALLC player gets suckered every time against an ALLD player, and therefore receives an
average payoff of S in their depressing encounters. When TFT plays ALLD, however, it gets suckered only on
the first round; after that it reciprocates each defection with defection, leading to an infinitely long
string of mutual defections and hence an average payoff of P.
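
The entries of Table 1 can be checked by letting the strategies play each other for many rounds. The Python sketch below is illustrative only: the function names and the finite horizon of 10,000 rounds are assumptions of this example, standing in for the infinitely repeated game without discounting.

```python
# One-shot payoffs (Axelrod's values): (my move, other's move) -> my payoff.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def allc(opponent_history):
    return "C"


def alld(opponent_history):
    return "D"


def tft(opponent_history):
    # Cooperate on the first round, then copy the opponent's previous move.
    return opponent_history[-1] if opponent_history else "C"


def average_payoff(row, col, rounds=10_000):
    """Average per-round payoff to `row` when it plays `col` for `rounds` rounds."""
    row_moves, col_moves = [], []
    total = 0
    for _ in range(rounds):
        a = row(col_moves)   # each player sees only the other's past moves
        b = col(row_moves)
        total += PAYOFF[(a, b)]
        row_moves.append(a)
        col_moves.append(b)
    return total / rounds


STRATEGIES = {"ALLD": alld, "TFT": tft, "ALLC": allc}

if __name__ == "__main__":
    for name, strategy in STRATEGIES.items():
        row = [round(average_payoff(strategy, other), 3) for other in STRATEGIES.values()]
        print(name, row)
```

With a finite horizon the TFT-versus-ALLD entries differ from P by a term of order 1/rounds, which is the first-round effect that the infinite-horizon limit discards.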

2.2. Replicator equation 

Next we set the game in an evolutionary framework. If each strategy reproduces at a rate proportional to 
its relative fitness, the resulting dynamics can be approximated by the following set of ordinary differential 
equations, known as the replicator equations: 


$$
\begin{aligned}
\dot{x} &= x\,(f_x - \phi), \\
\dot{y} &= y\,(f_y - \phi), \\
\dot{z} &= z\,(f_z - \phi).
\end{aligned}
\tag{1}
$$

Here x, y, and z denote the fractions of the population playing ALLD, TFT, and ALLC, respectively; $f_i$ is
the fitness of strategy i, defined as its expected payoff against the current mix of strategies; and

$$
\phi = x f_x + y f_y + z f_z \tag{2}
$$

is the average fitness in the whole population. By summing the differential equations for $\dot{x}$, $\dot{y}$, and $\dot{z}$, one
can verify that x + y + z = 1 for all time, as required by the definition of x, y, and z as relative frequencies.
The payoff matrix for the repeated Prisoner’s Dilemma implies that the fitnesses are given by

$$
\begin{aligned}
f_x &= xP + yP + zT, \\
f_y &= xP + yR + zR, \\
f_z &= xS + yR + zR.
\end{aligned}
\tag{3}
$$

For the parameter values T = 5, R = 3, P = 1, and S = 0 assumed above, and by replacing z with 1 - x - y,
we find that the fitnesses reduce to

$$
\begin{aligned}
f_x &= 5 - 4x - 4y, \\
f_y &= 3 - 2x, \\
f_z &= 3 - 3x,
\end{aligned}
\tag{4}
$$

and the average fitness in the population becomes $\phi = 3 - x(1 + x + 3y)$.
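
As a sanity check on this reduction, the general fitnesses of Eq. (3) can be compared numerically with the reduced expressions (including $f_z = 3 - 3x$, which follows from Eq. (3) with S = 0 and R = 3) at random points of the simplex. This is a minimal sketch; the function names and the sampling scheme are assumptions of the example.

```python
import random

T, R, P, S = 5, 3, 1, 0


def fitness_from_matrix(x, y, z):
    """Eq. (3): expected payoffs of ALLD, TFT, ALLC against the mix (x, y, z)."""
    fx = x * P + y * P + z * T
    fy = x * P + y * R + z * R
    fz = x * S + y * R + z * R
    return fx, fy, fz


def fitness_reduced(x, y):
    """Eq. (4): the same fitnesses after substituting z = 1 - x - y."""
    return 5 - 4 * x - 4 * y, 3 - 2 * x, 3 - 3 * x


def max_discrepancy(samples=1000, seed=0):
    """Largest absolute difference between the two forms over random simplex points."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = rng.random()
        y = rng.random() * (1 - x)
        z = 1 - x - y
        fx, fy, fz = fitness_from_matrix(x, y, z)
        gx, gy, gz = fitness_reduced(x, y)
        phi_full = x * fx + y * fy + z * fz          # Eq. (2)
        phi_reduced = 3 - x * (1 + x + 3 * y)        # reduced average fitness
        worst = max(worst, abs(fx - gx), abs(fy - gy), abs(fz - gz),
                    abs(phi_full - phi_reduced))
    return worst


if __name__ == "__main__":
    print("max discrepancy:", max_discrepancy())
```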

The phase portrait for the replicator equation (1) with fitnesses (4) can be drawn in the (x, y) plane,
by eliminating z via z = 1 - x - y. Unfortunately this way of presenting the phase portrait has certain
disadvantages. It distorts the geometry of the trajectories and it arbitrarily highlights two of the strategies
at the expense of the third. In the original (x, y, z) phase space, the phase portrait lives on the equilateral
triangle defined by the face of the simplex x + y + z = 1, where $0 \le x, y, z \le 1$. Thus, a more appealing
and symmetrical approach is to show the phase portrait as it actually appears on the simplex. The following
change of variables achieves this goal:



( 5 ) 


Equation (5) can be shown to be equivalent, up to a uniform scaling and a translation, to the change of
variables given by Eq. 35 in Wesson & Rand [2013].

Figure 1 plots the resulting phase portrait. The first thing to note is that it contains a saddle point
at (x, y, z) = (1, 0, 0), corresponding to the entire population at ALLD. The fact that the ALLD corner
is a saddle point, rather than a stable fixed point, reflects our simplifying assumptions that the game is
infinitely repeated with no discounting of future interactions; in effect, the first round of the game is totally
ignored. Without these assumptions, a standard result in repeated games [Nowak, 2006] is that there is
bistability along the ALLD-TFT edge. In that case the ALLD corner would have a basin of attraction 
whose size depends on the payoffs. 


Fig. 1. Phase portrait of system (1) with c = 0 and $\mu = 0$. The top vertex of the simplex is y = 1 (TFT).


Second, observe that Fig. 1 displays a line of neutrally stable fixed points along the side joining ALLC
to TFT, where ALLD is absent. Neutral drift takes place along this side in finite populations. In the infinite 
population case shown here, the system almost always ends up in a nirvana of cooperation, with a mix of 
TFT and ALLC determined by the initial conditions, and with ALLD extinct. 

However, the line of neutrally stable fixed points in this phase portrait is a structurally unstable feature. 
If the governing equations are perturbed by the addition of arbitrarily small terms, one expects that the 
line of fixed points will break and be replaced by something qualitatively different. 

Table 2. Payoff matrix of the replicator model with cost.

        ALLD    TFT     ALLC
ALLD    P       P       T
TFT     P - c   R - c   R - c
ALLC    S       R       R

Indeed, when we associate a small complexity cost c with playing TFT, the payoff matrix changes to
that shown in Table 2.

The new fitnesses become

$$
\begin{aligned}
f_x &= 5 - 4x - 4y, \\
f_y &= 3 - c - 2x, \\
f_z &= 3 - 3x,
\end{aligned}
\tag{6}
$$

and

$$
\phi = 3 - cy - x(1 + x + 3y). \tag{7}
$$

In the corresponding phase portraits, shown in Fig. 2, the previous line of neutrally stable fixed points
turns into an invariant line with the vector field flowing from a saddle point at TFT to another saddle 
point at ALLC. 


Fig. 2. Phase portrait of system (1) with c > 0 and $\mu = 0$: (a) 0 < c < 2/3; (b) 2/3 < c < 2; (c) c > 2. In each panel the top vertex is y = 1 (TFT).


The structure of the rest of the phase portrait depends on the size of c, as shown in Fig. 2, but the long-term
behavior of the system is the same in all three cases: ALLD takes over and cooperation dies out. 
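
The dependence on c can be made concrete by locating the non-corner fixed points of the mutation-free system. The sketch below is not taken from the paper's appendices; it follows from setting the cost-adjusted fitnesses $f_x = 5 - 4x - 4y$, $f_y = 3 - c - 2x$, $f_z = 3 - 3x$ equal to one another, and the function names are assumptions of the example. Under this derivation an interior fixed point exists only for c < 2/3, and a mixed ALLD/TFT fixed point on the edge z = 0 exists only for c < 2, consistent with the three regimes of Fig. 2.

```python
def fitnesses(x, y, c):
    """Fitnesses of ALLD, TFT, ALLC when TFT pays a complexity cost c."""
    return 5 - 4 * x - 4 * y, 3 - c - 2 * x, 3 - 3 * x


def interior_fixed_point(c):
    """State where all three fitnesses coincide, or None if it is not interior.

    f_y = f_z gives x = c; f_x = f_z then gives y = (2 - c) / 4.
    """
    x, y = c, (2 - c) / 4
    z = 1 - x - y                      # equals (2 - 3c) / 4
    return (x, y, z) if min(x, y, z) > 0 else None


def edge_fixed_point(c):
    """Mixed ALLD/TFT state on the edge z = 0 where f_x = f_y, or None."""
    x = 1 - c / 2
    return (x, 1 - x, 0.0) if 0 < x < 1 else None


if __name__ == "__main__":
    for c in (0.3, 1.0, 2.5):
        print(c, interior_fixed_point(c), edge_fixed_point(c))
```

Locating these points says nothing about their stability; that information comes from the phase portraits.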

Notice that none of the phase portraits so far contain any limit cycles. This could have been anticipated 
from a general theorem that forbids limit cycles in any system of replicator equations involving n = 3 
strategies, for any game and any payoff matrix [Hofbauer & Sigmund, 1998]. Periodic solutions can exist,
but only within continuous families of neutrally stable cycles. Such periodic orbits are not isolated and 
hence do not qualify as limit cycles. 


3. Replicator-Mutator Equation 

Limit cycles do, however, become possible when we allow ALLD, ALLC, and TFT to mutate into one 
another. For simplicity, let us restrict attention to single unidirectional mutations only. Then there are six 
possibilities. 

3.1. Example 1: TFT $\to$ ALLC

In this case, we assume that after replication occurs, a player with strategy TFT mutates to ALLC with
probability $\mu$. Then the replicator-mutator system is

$$
\begin{aligned}
\dot{x} &= x\,(f_x - \phi), \\
\dot{y} &= y\,(f_y - \phi) - \mu\, y f_y,
\end{aligned}
\tag{8}
$$

where x, y, and z denote the frequencies of ALLD, TFT, and ALLC, respectively. Note that as before, we
have chosen to eliminate z from the equations by using the identity z = 1 - x - y. Likewise, we do not
write the equation for $\dot{z}$ explicitly, since it can be obtained from $\dot{x}$ and $\dot{y}$ if needed via $\dot{z} = -\dot{x} - \dot{y}$.

After insertion of (6) and (7) into (8), the replicator-mutator system becomes

$$
\begin{aligned}
\dot{x} &= x\,[(c - 4)\,y + x\,(x + 3y - 3) + 2], \\
\dot{y} &= y\,[c\,(\mu + y - 1) - 3\mu + x\,(2\mu + x + 3y - 1)].
\end{aligned}
\tag{9}
$$
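
The substitution can be verified numerically. The sketch below evaluates the vector field once from its definition (replication followed by TFT $\to$ ALLC mutation with probability $\mu$, using the cost-adjusted fitnesses and average fitness) and once from the expanded cubic form, and reports the largest difference over random sample points. Function names and the sampling scheme are assumptions of this example.

```python
import random


def fitness_and_mean(x, y, c):
    """Cost-adjusted fitnesses of ALLD and TFT, and the average fitness."""
    fx = 5 - 4 * x - 4 * y
    fy = 3 - c - 2 * x
    phi = 3 - c * y - x * (1 + x + 3 * y)
    return fx, fy, phi


def rhs_from_definition(x, y, mu, c):
    """Replicator terms plus the loss of TFT offspring to ALLC at rate mu."""
    fx, fy, phi = fitness_and_mean(x, y, c)
    return x * (fx - phi), y * (fy - phi) - mu * y * fy


def rhs_expanded(x, y, mu, c):
    """The same vector field written out as cubic polynomials in x and y."""
    dx = x * ((c - 4) * y + x * (x + 3 * y - 3) + 2)
    dy = y * (c * (mu + y - 1) - 3 * mu + x * (2 * mu + x + 3 * y - 1))
    return dx, dy


def max_gap(samples=1000, seed=1):
    """Largest absolute difference between the two forms over random points."""
    rng = random.Random(seed)
    gap = 0.0
    for _ in range(samples):
        x = rng.random()
        y = rng.random() * (1 - x)
        mu, c = rng.random(), rng.random()
        a = rhs_from_definition(x, y, mu, c)
        b = rhs_expanded(x, y, mu, c)
        gap = max(gap, abs(a[0] - b[0]), abs(a[1] - b[1]))
    return gap


if __name__ == "__main__":
    print("max gap:", max_gap())
```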


The right hand side is cubic in x and y. With the help of computer algebra, one can calculate the fixed 
points for the system, all but one of which lie on the boundary of the simplex (the equilateral triangle). 
Explicit formulas for these fixed points are presented in Appendix A. We have also calculated the curves 
in $(\mu, c)$ space at which the interior fixed point undergoes saddle-node and supercritical Hopf bifurcations;
see Appendix A for details. As the parameters are varied, the stable limit cycles that are created by the
supercritical Hopf bifurcation are later destroyed by a homoclinic bifurcation. The associated curve of
homoclinic bifurcations was computed numerically with the help of the continuation package MATCONT
[Govaerts & Kuznetsov, 2008].


Figure 3 plots the bifurcation curves in $(\mu, c)$ space. The saddle-node curve is shown in blue, the Hopf
curve in red, and the homoclinic curve in green. These curves partition the parameter space into four 
regions, marked 1, 2, 3, and 4 on the figure. 


Fig. 3. Stability diagram in $(\mu, c)$ space when TFT $\to$ ALLC (horizontal axis: $\mu$, from 0 to 0.20; vertical axis: c, from 0 to 0.7). Saddle-node curve, blue; supercritical Hopf curve, red; homoclinic curve, green; BT, Bogdanov-Takens point.



Figure 4 shows representative phase portraits for each of the four regions. In region 1, ALLD wipes
out the other two strategies. (This makes sense intuitively. In region 1, TFT pays a high cost in fitness
compared to the other two strategies. Furthermore, because $\mu$ is large in region 1, TFT mutates rapidly into
ALLC, which in turn is clobbered by ALLD.) In region 2, a new pair of fixed points exist; they were created 
in a saddle-node bifurcation on the right-hand boundary of the equilateral triangle when the parameters 
crossed the saddle-node bifurcation curve. The unstable spiral seen in region 2 is the descendant of that 
node. Despite the existence of these two new fixed points, the system’s long-term behavior remains the 
same as in region 1: almost all trajectories approach ALLD. 

Fig. 4. Phase portrait of system (9) in different regions of $(\mu, c)$ space when TFT $\to$ ALLC: (a) region 1; (b) region 2; (c) region 3; (d) region 4. In each panel the top vertex is y = 1 (TFT). The stable limit cycle is shown in red.


In region 3, however, a new attractor - a stable limit cycle, shown in red in Fig. 4(c) - coexists with 
ALLD. Where did this limit cycle come from? It emerged from a homoclinic orbit. When the parameters 
lie on the homoclinic bifurcation curve between region 2 and region 3, a homoclinic orbit starts and ends 
at the newly created saddle (the one close to the side between TFT and ALLD). This homoclinic orbit 
becomes a stable limit cycle when the parameters lie in region 3. 

Finally, as we move from region 3 toward region 4, the limit cycle shrinks and ultimately becomes a 
point (a stable spiral) at the supercritical Hopf curve (the red curve in Fig. 3). When we move into the
interior of region 4 the system becomes bistable, with the stable spiral sharing the state space with ALLD. 

In biological terms, the population displays substantial levels of cooperation when it is on either the 
stable limit cycle of region 3 or the stable spiral of region 4. 


3.2. Example 2: ALLD $\to$ ALLC

Now we consider an alternative scenario of unidirectional mutation. Suppose that after replication occurs, 
each player with strategy ALLD mutates into ALLC with probability $\mu$. Notice that this example shares
one potentially important feature with Example 1 - in both cases, the target mutant being created is ALLC. 
On the other hand, this example differs from Example 1 in that the source of the mutant is now ALLD, 
not TFT. Which matters more: the commonality of the target or the non-commonality of its source? 

It turns out that the commonality of the target is more important. To see this, observe that in the 
presence of ALLD $\to$ ALLC mutations, the replicator-mutator system becomes

$$
\begin{aligned}
\dot{x} &= x\,(f_x - \phi) - \mu\, x f_x, \\
\dot{y} &= y\,(f_y - \phi),
\end{aligned}
\tag{10}
$$

and z = 1 - x - y, where x, y, and z again denote the frequencies of ALLD, TFT, and ALLC, respectively.
After insertion of (6) and (7) into (10), the replicator-mutator system yields

$$
\begin{aligned}
\dot{x} &= x\,[(c - 4)\,y - 5\mu + 4\mu\,(x + y) + x\,(x + 3y - 3) + 2], \\
\dot{y} &= y\,[c\,(y - 1) + x\,(x + 3y - 1)].
\end{aligned}
\tag{11}
$$

Figure 5 plots the stability diagram in $(\mu, c)$ space. Note how much it resembles Fig. 3. In both cases,
the parameter space divides into four regions bounded by curves of supercritical Hopf (red), saddle-node
(blue) and homoclinic (green) bifurcations, all of which emerge from a Bogdanov-Takens point. The main
difference is that the red curve of Hopf bifurcations goes through the origin in Fig. 3 and not in Fig. 5.



Fig. 5. Stability diagram in $(\mu, c)$ space when ALLD $\to$ ALLC. Saddle-node curve, blue; supercritical Hopf curve, red;
homoclinic curve, green; BT, Bogdanov-Takens point. Formulas for the Hopf and saddle-node bifurcation curves are given in 
Appendix B. The homoclinic curve was computed using MATCONT. 

Figure 6 shows the phase portraits corresponding to the four regions. As in Example 1, defection
dominates the long-term dynamics in regions 1 and 2, where a stable fixed point close to ALLD attracts 
almost all solutions. (Note that pure ALLD is no longer a fixed point, because of the assumed mutations 
from ALLD to ALLC. That is why the globally attracting fixed point lies between ALLD and ALLC. We 
will refer to this fixed point as “almost ALLD.”) In region 3, a stable limit cycle (shown in red) coexists 
with almost ALLD. In region 4, the limit cycle no longer exists; it has contracted to a stable spiral fixed 
point via the supercritical Hopf bifurcation between regions 3 and 4. 

It is important to realize that the population exhibits large amounts of cooperation when it is in the 
stable spiral state or cycling periodically. This becomes clear when one looks at time series instead of 
phase portraits. Figure 7 shows the approach to a stable limit cycle in which ALLD is nearly absent except
during brief spikes, whereas ALLC and TFT predominate for most of the cycle. Thus, the stable limit cycle 
signifies more than just an avoidance of all-out defection; it represents a state of significant cooperation. 
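
A time series of this kind can be generated by integrating the ALLD $\to$ ALLC system directly. The sketch below uses a hand-written fourth-order Runge-Kutta scheme so that it needs no external libraries. The default parameter values $(\mu, c) = (0.08, 0.04)$ are those quoted in the caption of the time-series figure; the initial condition, step size, integration time, and function names are assumptions of this example. Because the limit cycle coexists with the attracting state near ALLD, whether a given initial condition reaches the cycle depends on which basin it starts in, so the code makes no claim about the outcome.

```python
def rhs(x, y, mu, c):
    """Vector field for ALLD -> ALLC mutation; x = ALLD, y = TFT, z = 1 - x - y."""
    dx = x * ((c - 4) * y - 5 * mu + 4 * mu * (x + y) + x * (x + 3 * y - 3) + 2)
    dy = y * (c * (y - 1) + x * (x + 3 * y - 1))
    return dx, dy


def rk4_step(x, y, dt, mu, c):
    """One classical fourth-order Runge-Kutta step of size dt."""
    k1x, k1y = rhs(x, y, mu, c)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, mu, c)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, mu, c)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, mu, c)
    x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y


def time_series(x0, y0, mu=0.08, c=0.04, dt=0.01, t_end=200.0, stride=100):
    """Return (t, x, y, z) samples, one every `stride` integration steps."""
    x, y = x0, y0
    samples = [(0.0, x, y, 1 - x - y)]
    steps = int(round(t_end / dt))
    for n in range(1, steps + 1):
        x, y = rk4_step(x, y, dt, mu, c)
        if n % stride == 0:
            samples.append((n * dt, x, y, 1 - x - y))
    return samples


if __name__ == "__main__":
    for t, x, y, z in time_series(0.1, 0.6)[::20]:
        print(f"t={t:6.1f}  ALLD={x:.3f}  TFT={y:.3f}  ALLC={z:.3f}")
```

Plotting the three columns against t gives panels analogous to those of the time-series figure; the combined level of TFT and ALLC is the quantity to watch when judging how much cooperation a trajectory sustains.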

3.3. Other single unidirectional mutations 

Four other types of single unidirectional mutations are possible: ALLD $\to$ TFT, ALLC $\to$ TFT, ALLC $\to$
ALLD, and TFT $\to$ ALLD. Using the techniques above, we have analyzed these remaining cases completely.
But rather than wade through the details, it seems clearer and more useful to summarize the main results, 
which are as follows. 

For all four cases, the phase portraits and the bifurcation curves are more complicated than those 
presented in Examples 1 and 2. Nevertheless, all of them display stable limit cycles in some region of the 
parameter space. Specifically, the region in $(\mu, c)$ space where stable limit cycles exist is always bounded
on one side by a curve of supercritical Hopf bifurcations, and on the other side by a curve of homoclinic
bifurcations. Both curves emanate from a Bogdanov-Takens point.

Fig. 6. Phase portrait of system (11) in different regions of $(\mu, c)$ space when ALLD $\to$ ALLC: (a) region 1; (b) region 2; (c) region 3; (d) region 4. In each panel the top vertex is y = 1 (TFT). The stable limit cycle is shown in red.

What we find most intriguing is that the homoclinic bifurcation curve - the counterpart of the green
curve in Figs. 3 and 5 - always passes through the origin $(\mu, c) = (0, 0)$. Hence stable limit cycles always
exist for arbitrarily small perturbations of the original system (1) with fitnesses (4), no matter how the
mutation is implemented.

This is the sense in which limit cycles are “sparked by mutation” in the Prisoner’s Dilemma among 
ALLC, ALLD, and TFT. If just one of the strategies can mutate into just one of the others, it takes only 
an infinitesimal cost c and an infinitesimal mutation probability $\mu$ for the system to display self-sustained
oscillations, with large amounts of cooperation during part of the cycle. In this way, the slightest bit of 
mutation allows the population to avoid a collapse into all-out defection. 


3.4. Uniform global mutation 

So far we have focused on a very restricted class of mutation pathways: single unidirectional mutations, 
in which exactly one strategy mutates into exactly one other. But we suspect that the sparking of limit
cycles by mutation is more general.

Fig. 7. Time series for a solution of system (11) as it approaches the stable limit cycle: (a) ALLD; (b) TFT; (c) ALLC, each plotted against time t. Notice that the level of cooperation is high during much of the cycle, as shown by the high combined levels of ALLC and TFT, while ALLD remains low except for brief spikes. Parameter values: $(\mu, c) = (0.08, 0.04)$.

For example, consider the extreme opposite case of uniform global mutation, where each strategy 
mutates to the other two with probability $\mu$, and hence stays the same with probability $1 - 2\mu$. The
replicator-mutator equations for this case are given by

$$
\begin{aligned}
\dot{x} &= x\,[f_x (1 - 2\mu) - \phi] + \mu f_y y + \mu f_z z, \\
\dot{y} &= y\,[f_y (1 - 2\mu) - \phi] + \mu f_x x + \mu f_z z,
\end{aligned}
\tag{12}
$$

where z = 1 - x - y. Substitution of (6) and (7) into (12) yields

$$
\begin{aligned}
\dot{x} &= \mu\,[x\,(11x + 9y - 16) - cy] + x\,[(c - 4)\,y + x\,(x + 3y - 3) + 2] + 3\mu, \\
\dot{y} &= cy\,(2\mu + y - 1) + 3\mu + x^2 (y - \mu) + x\,(3y - 1)(\mu + y) - 9\mu y.
\end{aligned}
\tag{13}
$$

Figure 8 plots the stability diagram in $(\mu, c)$ space, showing the curves where supercritical Hopf (red),
saddle-node (blue) and homoclinic (green) bifurcations take place. The diagram splits into five regions. All
the bifurcation curves of this system were computed using MATCONT [Govaerts & Kuznetsov, 2008].

Figure 9 shows the phase portraits corresponding to the five regions. As in the earlier Examples 1
and 2, defection prevails in regions 1 and 2, where a stable fixed point close to ALLD attracts almost all 
solutions. In region 3, the possibility of cooperation reemerges: a stable limit cycle (shown in red) coexists 
with a stable node near ALLD. In region 4, the limit cycle becomes a global attractor; the stable node of 
region 3 is lost in a saddle-node bifurcation when the parameters move from region 3 to 4. The limit cycle 
no longer exists in region 5. 

3.5. Conjecture 

In every one of the examples considered so far, the region where stable limit cycles exist extends all the way 
down to the origin in parameter space. This means that limit cycles can be sparked by an arbitrarily small
mutation probability $\mu$, if the complexity cost c of TFT is also suitably small. These results have been
obtained for topologies at opposite ends of the graph-theoretic spectrum: single unidirectional mutation (in
which mutation occurs along one directed edge) and uniform global mutation (in which mutation occurs
in both directions along the edges of the complete graph).

Fig. 8. Stability diagram in $(\mu, c)$ space for the case of uniform global mutation. Saddle-node curve, blue; supercritical Hopf
curve, red; homoclinic curve, green; BT, Bogdanov-Takens point; CP, cusp point. The bifurcation curves were computed using
MATCONT.

We conjecture, therefore, that a similar sparking of stable limit cycles occurs for any pattern of mutation.
To make this statement precise, we write down the general form of the replicator-mutator equation.
Let the frequencies of the three strategies ALLD, TFT and ALLC be denoted as $x_1, x_2, x_3$ rather than
x, y, z. Then the replicator-mutator dynamics are given by

$$
\dot{x}_i = \sum_{j=1}^{3} x_j f_j Q_{ji} - x_i \phi \tag{14}
$$

for i = 1, 2, 3. Here $f_i$ is the fitness of strategy i (obtained from Eq. (6) as before), $\phi = \sum_j x_j f_j$ is the
average fitness in the population, and $Q_{ij}$ is the probability that players with strategy i mutate to playing
strategy j. Since the $Q_{ij}$ represent probabilities, they are non-negative and satisfy $\sum_j Q_{ij} = 1$; hence the
matrix Q is row stochastic. In the limiting case where mutation does not occur, Q is the identity matrix
and $Q_{ij} = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta.

Now suppose that mutation does occur, and that it can be characterized by a mutation matrix M and 
a single parameter $0 < \mu < 1$ such that

$$
Q_{ij} = \delta_{ij} - \mu M_{ij}. \tag{15}
$$

Any matrix M that maintains the row-stochasticity and non-negativity of Q is admissible. This diversity 
of M is what we mean by any pattern of mutation. 
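
The admissibility condition can be checked mechanically. The sketch below builds $Q$ from Eq. (15) and tests non-negativity and unit row sums. The two example matrices are hypothetical encodings of a single unidirectional mutation (TFT into ALLC) and of a uniform pattern in which every strategy mutates into each of the others with probability $\mu$; the helper names are likewise assumptions.

```python
import numpy as np

def q_from_m(M, mu):
    """Build Q = I - mu * M, following Eq. (15)."""
    M = np.asarray(M, dtype=float)
    return np.eye(M.shape[0]) - mu * M

def is_admissible(M, mu, tol=1e-12):
    """True if Q is non-negative and row stochastic for this M and mu."""
    Q = q_from_m(M, mu)
    rows_ok = np.allclose(Q.sum(axis=1), 1.0, atol=tol)
    return bool(np.all(Q >= -tol) and rows_ok)

# Strategy order: ALLD, TFT, ALLC (hypothetical example matrices).
M_single = np.array([[0, 0, 0],
                     [0, 1, -1],    # TFT -> ALLC
                     [0, 0, 0]])
M_uniform = np.array([[2, -1, -1],
                      [-1, 2, -1],
                      [-1, -1, 2]])

print(is_admissible(M_single, 0.1), is_admissible(M_uniform, 0.1))
```

With the sign convention of Eq. (15), admissibility requires each row of $M$ to sum to zero, the off-diagonal entries to be non-positive, and $\mu M_{ii} \le 1$.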

Phrased in these terms, our conjecture is that for any fixed, admissible $M$, one can find an open set
of parameters $(\mu, c)$ arbitrarily close to $(0, 0)$ such that stable limit cycles exist for the replicator-mutator
system (14).

We have verified this conjecture informally for a handful of admissible mutation matrices $M$, but
we have not done a comprehensive numerical study nor do we have analytical evidence to support the
conjecture. So it could well be false. But we suspect it is true. A key step toward proving it would be to
demonstrate that a curve of homoclinic bifurcations (the counterpart of the green curves in the stability
diagrams, e.g., Fig. 8) always emanates from the point $(\mu, c) = (0, 0)$, for any admissible $M$.

Fig. 9. Phase portrait of system (13) in different regions of $(\mu, c)$ parameter space: (a) Region 1, (b) Region 2, (c) Region 3,
(d) Region 4, (e) Region 5. In each panel the labeled vertex is $y = 1$ (TFT). The stable limit cycle is shown in red.


4. Discussion 

Our results show that a variety of different mutation structures give rise to evolutionary cycles of cooperation
and defection in the repeated Prisoner’s Dilemma. These cycles appear to be a robust feature of
interactions in well-mixed populations of ALLC, ALLD and TFT, rather than being specific to particular
assumptions regarding the mutation matrix. These cycles, as well as the stable spirals created by mutation
in other parts of the $(\mu, c)$ parameter space, lead to substantial levels of cooperation. Thus the grim picture
painted by the standard replicator equation without mutation, in which ALLD dominates even in repeated
games, may be too pessimistic.

The replicator equation assumes that the population is well mixed. In some settings it is more realistic
to regard the population as spatially structured, with players interacting only with a fixed subset of others
on a regular lattice. Evolutionary cycles of cooperation and defection among ALLC, ALLD, and TFT can
occur in this case too, even in the absence of mutation [Szolnoki et al., 2009]. Similar cycles occur for
populations playing a spatial version of the rock-paper-scissors game [Szolnoki & Szabó, 2004].

Although the replicator equation is typically thought of as describing genetic evolution, it can just as
well describe a social learning dynamic in which people imitate the strategies of more successful others. In
this context, mutation corresponds to experimentation or innovation, i.e., trying out a strategy other than
the one which is performing well at the moment. Experimentation is a key element of human behavior
[Rand et al., 2013; Traulsen et al., 2010], and our results suggest that it may also help to explain the
success of cooperation in human societies.

Appendix A TFT → ALLC

For Example 1 of the main text, in which TFT mutates into ALLC, the system has at most 4 fixed
points $(x^*, y^*)$ in the simplex $x + y + z = 1$ with $0 \le x, y, z \le 1$:


$(x_1^*, y_1^*) = (0, 0)$ is a saddle,

$(x_2^*, y_2^*) = (1, 0)$ is a stable node,

\[ (x_3^*, y_3^*) = \left( \frac{\Delta_1 - 5c\mu + c + 17\mu + 2}{12\mu + 4},\ \frac{-(\mu + 1)\Delta_1 - c\mu^2 + 8c\mu + c + \mu^2 - \mu + 2}{24\mu + 8} \right) \]

is a saddle, and

\[ (x_4^*, y_4^*) = \left( \frac{-\Delta_1 - 5c\mu + c + 17\mu + 2}{12\mu + 4},\ \frac{(\mu + 1)\Delta_1 - c\mu^2 + 8c\mu + c + \mu^2 - \mu + 2}{24\mu + 8} \right) \]

could be a stable or unstable spiral (or node), depending on $\mu$ and $c$ [Strogatz, 1994], where in the formulas above

\[ \Delta_1 = \sqrt{c^2(\mu + 3)^2 - 2c\left((\mu - 11)\mu + 6\right) + (\mu - 28)\mu + 4}. \]


The fixed points $(x_1^*, y_1^*)$, $(x_2^*, y_2^*)$, and $(x_3^*, y_3^*)$ are on the boundary of the simplex and $(x_4^*, y_4^*)$ is
inside the simplex. A saddle-node bifurcation occurs when $(x_3^*, y_3^*)$ and $(x_4^*, y_4^*)$ coalesce and disappear.
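
The closed-form expressions for the third and fourth fixed points can be spot-checked numerically. The sketch below assumes per-round payoffs T = 5, R = 3, P = 1, S = 0 with TFT paying the cost $c$, and assumes that the TFT-to-ALLC mutation acts on offspring in proportion to fitness; neither assumption is quoted from this appendix, and the function names are hypothetical. Under these assumptions the vector field vanishes at both points to rounding error.

```python
import math

def rhs_tft_to_allc(x, y, mu, c):
    """(dx/dt, dy/dt) with x = ALLD, y = TFT, z = 1 - x - y = ALLC."""
    z = 1.0 - x - y
    f_d = x + y + 5.0 * z            # assumed ALLD fitness
    f_t = x + 3.0 * (y + z) - c      # assumed TFT fitness, cost c
    f_c = 3.0 * (y + z)              # assumed ALLC fitness
    phi = x * f_d + y * f_t + z * f_c
    # A fraction mu of TFT offspring become ALLC, so TFT keeps (1 - mu) f_t.
    return x * (f_d - phi), y * ((1.0 - mu) * f_t - phi)

def fixed_points_3_and_4(mu, c):
    """Closed forms for (x3*, y3*) and (x4*, y4*) from this appendix."""
    d1 = math.sqrt(c**2 * (mu + 3)**2 - 2*c*((mu - 11)*mu + 6)
                   + (mu - 28)*mu + 4)
    points = []
    for s in (1.0, -1.0):            # s = +1 gives point 3, s = -1 point 4
        x = (s*d1 - 5*c*mu + c + 17*mu + 2) / (12*mu + 4)
        y = (-s*(mu + 1)*d1 - c*mu**2 + 8*c*mu + c
             + mu**2 - mu + 2) / (24*mu + 8)
        points.append((x, y))
    return points

for x, y in fixed_points_3_and_4(0.05, 0.2):
    print((x, y), rhs_tft_to_allc(x, y, 0.05, 0.2))
```

The parameter pair $(\mu, c) = (0.05, 0.2)$ is an arbitrary choice for which $\Delta_1$ is real; other pairs can be substituted as long as the radicand stays non-negative.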

The equation of the saddle-node bifurcation curve is

\[ c = \frac{(\mu - 11)\mu - 4\sqrt{6}\sqrt{\mu(3\mu + 1)} + 6}{(\mu + 3)^2}. \qquad (A.1) \]

The fixed point $(x_4^*, y_4^*)$ undergoes a supercritical Hopf bifurcation and switches from a stable spiral
to an unstable spiral at certain parameter values. The equation of the Hopf bifurcation curve, which was
computed analytically, is


3 - 13u 1 I 


Aa 


+ 


A^ 


3^(/i-l)^ 3 (m-1)M4 


1 

+ 2 




A 2 


Aa 


A^ 


3(/x-1)2^4 


+ 


Af\ 


4,M7 + 


Aa 


+ 


A 5 


(A. 2 ) 

(A.3) 


3^(/i-l)2 3(/i-l)^A4 


where 


(3 - 13/^)2 35/x 2 -34^- 8 35/x2 - 34^-8 

" “ 4(m-1)2 ^ 3(/x2-2/x + 1) (m-1)2 ’ 

As = (-5240604096/i^^ - 40578465024;u^^ + 200756188800^4^° - 265354820640^4° + 60401533248/i® + 
110419920576^4^ - 55059298560^4° - 12327872544;t4° + 4987630080/ + 1779338880/ + 218439936^4^ - 
110592;t4- 1880064)5, 


A4 = (204136/ - 578832/ + 416952/ + 51238/ - 79740/ - 12984/4 + A3 - 1456)5, 


As = ^ (2272/ - 3160/ + 52l/ + 316/4 + lOO) , 

(3-13/4)3 4 (35/- 34/t - 8) (3 - 13/ 8(49/-4/t + 4) 

° (/4-1)3 (/^-1)3 (m-1)^ 

35/i^ — 34/i — 8 35/i^ — 34/i — 8 

^ ^ 3 (/i2 - 2/i + 1) (/i-l)2 


and 

Appendix B ALLD → ALLC


For Example 2 of the main text, in which ALLD mutates into ALLC, system (10) has at most 5 fixed
points $(x^*, y^*)$ in the simplex $x + y + z = 1$ with $0 \le x, y, z \le 1$:

$(x_1^*, y_1^*) = (0, 0)$ is a saddle,

$(x_2^*, y_2^*) = (0, 1)$ is a saddle,

$(x_3^*, y_3^*) = \left( \frac{1}{2}\left(-4\mu - \sqrt{4\mu(4\mu - 1) + 1} + 3\right),\ 0 \right)$ is a stable node,

\[ (x_4^*, y_4^*) = \left( -\frac{4c\mu + \Delta_8 + c - 11\mu + 2}{16\mu - 4},\ \frac{8c\mu^2 - 10c\mu + (2\mu - 1)\Delta_8 + c + 18\mu^2 - 11\mu + 2}{8(\mu - 1)(4\mu - 1)} \right) \]

is a saddle, and

\[ (x_5^*, y_5^*) = \left( \frac{-4c\mu + \Delta_8 - c + 11\mu - 2}{16\mu - 4},\ \frac{8c\mu^2 - 10c\mu - (2\mu - 1)\Delta_8 + c + 18\mu^2 - 11\mu + 2}{8(\mu - 1)(4\mu - 1)} \right) \]

could be a stable or unstable spiral (or node), depending on $\mu$ and $c$. In the expressions above,

\[ \Delta_8 = \sqrt{(4c\mu + c - 11\mu + 2)^2 + 8c(4\mu - 1)(-c + \mu + 2)}. \]

The fixed points $(x_1^*, y_1^*)$, $(x_2^*, y_2^*)$, and $(x_3^*, y_3^*)$ are on the boundary of the simplex, and $(x_4^*, y_4^*)$ and
$(x_5^*, y_5^*)$ are in the interior. A saddle-node bifurcation occurs when $(x_4^*, y_4^*)$ and $(x_5^*, y_5^*)$ coalesce and
disappear. The equation of the saddle-node bifurcation curve is

\[ c = \frac{\mu(28\mu - 25) + 6 - 4\sqrt{6}\sqrt{(1 - \mu)\mu(3\mu - 2)(4\mu - 1)}}{(3 - 4\mu)^2}. \qquad (B.1) \]
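
Since the two interior fixed points differ only in the sign of $\Delta_8$, they coalesce exactly where $\Delta_8 = 0$, so the saddle-node curve (B.1) should make the radicand of $\Delta_8$ vanish. The short check below confirms this numerically for a few mutation rates in $0 < \mu < 1/4$, where the square root in (B.1) is real. The function names are hypothetical, and the check uses only the formulas of this appendix.

```python
import math

def saddle_node_c(mu):
    """Complexity cost c on the saddle-node curve, Eq. (B.1)."""
    radicand = (1 - mu) * mu * (3*mu - 2) * (4*mu - 1)
    numerator = mu*(28*mu - 25) + 6 - 4*math.sqrt(6)*math.sqrt(radicand)
    return numerator / (3 - 4*mu)**2

def delta8_squared(mu, c):
    """Radicand of Delta_8 (the quantity under the square root)."""
    return (4*c*mu + c - 11*mu + 2)**2 + 8*c*(4*mu - 1)*(-c + mu + 2)

for mu in (0.01, 0.1, 0.2):
    c = saddle_node_c(mu)
    print(mu, c, delta8_squared(mu, c))
```

The printed values of the radicand are zero up to floating-point rounding, which is consistent with (B.1) being one root of the quadratic in $c$ obtained by setting $\Delta_8^2 = 0$.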

The fixed point $(x_5^*, y_5^*)$ undergoes a supercritical Hopf bifurcation and switches from a stable spiral
to an unstable spiral at the curve given by


(1-i^) Aio ^ (l + i\/3) (48/i3-201/i2 + 114/i-33) 


6 ^ 


3 22/3yl^Q 


(B.2) 


where 


Ag = C4 (48/i3 - 201/i2 -h 114/x - 33)^ -h (-2592/i4 -h 5670/i3 - 5022/i2 + 810/i -h 162)" 


and 


^10 = y-2592/x4 + 5670/^3 _ 5022/^2 + ^9 + 810/x + 162. 
















